Abstract

With the rapid growth of the World Wide Web, clients attempting
to access some popular web sites are experiencing slow
response times due to server load and network congestion. Replacing
the single server machine with a set of replicated servers
is a cost-effective solution to partition server load which also allows
incremental scalability and fault transparency. Distributing
these replicated servers geographically can reduce network congestion
and increase availability. However, distributed web sites
are faced with the issue of allocating servers: how do clients find
out about the replicas and how do they decide which one to contact?
Popular web sites have well publicized server names and
require a transparent mapping of the public server name to replicated
servers.

Unlike most traditional approaches, we propose a technique
which pushes the server allocation functionality onto the client.
We argue that this approach scales well and results in increased
performance in many cases. Building on theoretical work based
on game theory, we show that the usage of individual replicas can
be effectively controlled with cost functions even when the clients
are noncooperative. We present the design and implementation of
WebSeAl, our prototype system realizing these techniques. WebSeAl
does not require any changes to existing client and server
code, conforms to all standards, and does not generate any control
messages. Preliminary experiments utilizing servers on six
continents and in controlled settings indicate that WebSeAl improves
performance significantly while imposing little overhead.
1  Introduction

The rapid growth of the World Wide Web has led to
a steady increase of client requests to many popular web
sites. Both overloaded servers and network congestion
contribute to slow response times of such sites. It may
not be cost-effective to upgrade the server machine with
a more powerful one, especially when incremental scalability
is desired. Instead, most sites opt to replace the single
server with a cluster of replicated servers [15, 16]. Although
this may solve the problem of overloaded servers,
it does not address network congestion. In addition, increasing
the network capacity may not be cost-effective
when incremental scalability is desired. Instead, some
sites choose to geographically distribute the replicated
servers; this approach has become popular with software
archives (e.g. [20]) which have mirror sites, typically on
several continents. Such a distributed architecture may
result in increased availability of the service in times of
network congestion and partial unavailability, and it may
increase performance by taking advantage of “proximity”
between clients and servers.

Currently, distributed web sites require the user to manually
select a server out of a list of replicas. For example,
there exist over 70 mirror sites distributed all
over the world from which users can download Netscape
browsers [17]; however, the decision as to which one to use
is left to the user. Designing a transparent allocation strategy
for a distributed web site which does not sacrifice any
of its benefits is a challenging task. A successful solution
must meet several requirements:

•  Transparent Name Resolving: Popular web sites
have well publicized server names and require a transparent
mapping to replicated servers.


[Report Documentation Page (Standard Form 298, Rev. 8-98). Title: WebSeAl: Web Server Allocation. Performing organization: Defense Advanced Research Projects Agency, 3701 North Fairfax Drive, Arlington, VA 22203-1714. Distribution: approved for public release; distribution unlimited. Security classification: unclassified. Number of pages: 10.]

•  Scalability: Server allocation should gracefully scale
with the increasing number of clients.

•  Flexibility: Different users may have different objectives
when accessing web sites, requiring support for
customized strategies.

•  Load Balancing: Service providers should be able
to effectively control the utilization of individual
servers.

•  Dynamic Changes in Server Pool: Addition, removal,
and migration of servers should be supported,
and changes should be reflected as quickly as possible.

•  Fault Transparency: Unresponsive machines should
be detected and requests transparently redirected to
other replicas. Also, previously unresponsive machines
which become available again should be incorporated
quickly.

•  Geographic Distribution: Network delays between
a client and individual servers of a distributed service
might differ significantly. Server allocation should
take advantage of this while still accommodating dynamic
changes in network performance and server
load.

•  Legacy Code and Standards: The solution should not
require any changes to existing client or server code and
should conform to existing standards.

A  comprehensive  solution  for  allocation  of  distributed 
web  servers  must  address  all  these  factors.  We  are  not 
aware  of  any  system  which  achieves  this.  In  this  paper, 
we  present  a  system  called  WebSeAl  which  addresses  these 
issues. 

The research leading to our system is based on theoretical
work where provable methods for controlling network
load using pricing mechanisms were developed. It was
shown that even with noncooperative clients (in a fully distributed,
and therefore scalable fashion), the network load
can be controlled effectively. The work presented in this
paper applies these techniques to provide scalable and controllable
load balancing for distributed web servers.

The remainder of this paper is structured as follows.
Section 2 gives an overview of related work. Section 3 discusses
WebSeAl's architecture and describes how clients
strive to minimize delays. Section 4 shows how load balancing
can be achieved by introducing cost functions. Experimental
results showing WebSeAl's performance are
presented in Section 5, and Section 6 provides concluding
remarks.

2  Related  Work 

The  HTTP  redirect  [1]  approach  uses  the  HTTP  return 
code  URL  Redirection  [2]  to  perform  load  balancing.  A 
busy  server  returns  the  address  of  another  server  instead  of 
the  actual  response,  asking  the  client  to  resubmit  its  request 
to  that  server.  This  creates  additional  network  traffic  and 
increased  latency.  Every  request  is  initially  addressed  to 
the  publicly  known  server  which  creates  a  single  point  of 
failure  and  the  potential  for  a  bottleneck  due  to  servicing 
redirects. 

Domain Name System (DNS) based approaches [3, 10,
5] perform load balancing at the name resolution level. The
name server at the server side is modified to respond to
translation requests with the IP numbers of different hosts
in a Round-Robin fashion. This results in partitioning
client requests among the replicated hosts. The main disadvantage
of this approach is that intermediate name servers
and clients cache name-to-IP mappings, which can result in
significant load imbalance.

Server side approaches [7, 5] use a server side routing
module which redirects all incoming requests to a set
of clustered hosts based on load characteristics. This is
achieved at the IP layer; that is, the routing module modifies
all IP packets before forwarding them to individual hosts.
An alternate server side solution which avoids modifying
IP packets is presented in [6]. These approaches have the
drawback that the routing module represents a single point
of failure and, since all requests pass through it, can become
a bottleneck. In addition, server side approaches
work well only for clustered servers.

Perhaps most closely related to WebSeAl is the work
presented in [21]. It uses a modified web browser to perform
routing decisions at the client side. The browser
downloads an applet which the service provider needs to
implement to realize service specific routing. This approach
creates increased network traffic due to applet transmission
and potential control messages between the applet
and the servers.
3  WebSeAl  Architecture

A distributed web site (fig. 1) consists of a set of servers
$S_1, \ldots, S_n$, each with its own IP number $IP_1, \ldots, IP_n$. One
of these servers is known to the standard DNS system by
the logical address of the original single server. We assume
that the service content is replicated and that each
server knows about the IP numbers of all individual servers
comprising the distributed service. This might be achieved
through mirroring or with a distributed file system [8, 18].
Except for the bootstrapping phase, all replicas are treated
equally, and as long as any one replica is responsive, clients
will be able to access the service.


Figure  1:  A  distributed  web  site. 

In WebSeAl, clients are responsible for routing individual
requests to different servers comprising a distributed
web site. This functionality is provided by a client agent
module. In the basic architecture of the system, one agent
is associated with each client. The client agent:

•  intercepts the requests generated by the local client;

•  has address information about the individual servers;

•  collects dynamic performance data (e.g. network conditions,
server load, and other site specific information);

•  makes routing decisions based on this information;

•  forwards the request to the selected server, receives
the response, and delivers it to the client;

•  transparently redirects the request to an alternate
server if the selected server is not responsive.

A server agent module located on each server host provides
address information to the client agent. It also communicates
other site specific information which might be
used to control access to the server pool, to support charging
for services, and so forth, as will be discussed in Section 4.

In the remainder of this section, we will first discuss
WebSeAl's routing strategies, then present how logical
names of distributed web sites are resolved, and conclude
with implementation specific issues.

3.1  Routing  Strategies

The combination of the stateless nature of HTTP and
the fact that many web pages contain several images and
frames results in the generation of several requests to retrieve
a single web page. WebSeAl client agents measure
the total response time for each such request. The total
response time measured is the complete end-to-end delay
which includes connection establishment, network delay,
and server time. WebSeAl client agents strive to minimize
this total delay.

Each client agent makes routing decisions based on the
average response time of each server. These averages are
estimated using the measured response times for the N
most recent requests. The updated routing strategy is used
to direct the next N requests to the appropriate servers in
the pool. Alternatively, the client agent could estimate the
average response times by sending occasional probes at the
cost of increased network traffic. One of the main design
goals of WebSeAl is to avoid control traffic, so we decided
against this approach.

One possible routing strategy client agents could employ
is to always contact the most responsive server. Routing
all requests to a single server, however, will fail to collect
new performance data for the slower servers. Instead,
we use probabilistic routing to ensure that client agents
collect new performance data for all servers. More specifically,
if $T_{mi}$ denotes the average response time for requests
routed from client $m$ to server $i$, then routing of the next N
requests is based on the probability distribution:

$$P_{mi} = \frac{1/T_{mi}^{k}}{\sum_{j=1}^{n} 1/T_{mj}^{k}} \qquad (1)$$

where $k \ge 0$ is a constant. With $k = 0$, requests are
routed to the servers randomly, without taking into account
their performance. With $k = 1$, we can achieve a linear
distribution (routing probabilities inversely proportional to
the average response times). This will favor fast machines
while still using slower ones. However, the overall performance
might suffer due to possibly long delays from slow servers. By
raising $k$, more requests will be routed to the most responsive
servers.¹ Very high routing probabilities for the fastest
servers will cause very infrequent usage of slower ones,
which in turn will decrease the potential to quickly detect
improved servers. WebSeAl imposes a minimum threshold
to circumvent this.
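
The probabilistic routing rule can be illustrated with a minimal Python sketch. It assumes Eq. (1) assigns each server a weight of 1/T^k and normalizes the weights, which is how we read it alongside Eq. (2). The names `routing_probabilities`, `pick_server`, and the `floor` parameter are hypothetical, and the way the minimum threshold is applied (a fixed share reserved for every server) is an assumption, since the text does not specify it.

```python
import random


def routing_probabilities(avg_times, k=2.0, floor=0.0):
    """Compute per-server routing probabilities in the spirit of Eq. (1).

    avg_times: dict mapping a server name to its average response time
               in seconds (must be positive).
    k:         exponent; 0 yields a uniform choice, larger values favor
               the most responsive servers.
    floor:     hypothetical minimum probability per server (an assumption;
               the paper does not specify how its threshold is applied).
    """
    n = len(avg_times)
    if n == 0:
        raise ValueError("at least one server is required")
    if floor < 0 or floor * n > 1.0:
        raise ValueError("floor must satisfy 0 <= floor <= 1/n")

    # Unnormalized weights 1 / T^k, then normalize so they sum to 1.
    weights = {s: (1.0 / t) ** k for s, t in avg_times.items()}
    total = sum(weights.values())
    probs = {s: w / total for s, w in weights.items()}

    # Reserve `floor` for every server and share the rest proportionally,
    # so slow servers are still sampled occasionally.
    return {s: floor + (1.0 - n * floor) * p for s, p in probs.items()}


def pick_server(probs, rng=random):
    """Draw one server according to the given probabilities."""
    servers = list(probs)
    return rng.choices(servers, weights=[probs[s] for s in servers])[0]
```

With average response times of 0.1 s and 0.2 s and k = 1, the faster server would receive about two thirds of the requests; raising k shifts more of the load toward it, while a nonzero floor keeps the slower server in rotation.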

In our current implementation, we base routing decisions
only on the most recent N measurements. We are
considering several alternate strategies, two of which are:

•  Weighted Average: When calculating the performance
estimate of a replica, more recent data should
impact the overall performance more than older data,
and the estimates should be updated more frequently.

•  Time-of-Day: Network conditions and server usage
vary with the time-of-day or the day of the week [4,
9], and this information should be considered in the
routing strategy.
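
One possible way to realize the Weighted Average strategy is an exponentially weighted moving average, sketched below in Python. This is only an illustration of the idea that recent samples should count more. The text does not say which weighting scheme is being considered, and the class name `ResponseTimeEstimate` and the smoothing parameter `alpha` are hypothetical.

```python
class ResponseTimeEstimate:
    """Exponentially weighted moving average of response times.

    alpha close to 1 makes the estimate follow recent samples closely;
    alpha close to 0 makes it change slowly. (Hypothetical parameter.)
    """

    def __init__(self, alpha=0.3):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None  # no measurement seen yet

    def update(self, sample):
        """Fold one measured response time (seconds) into the estimate."""
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1.0 - self.alpha) * self.value
        return self.value
```

Because the estimate is updated after every request rather than once per batch of N requests, it would also address the wish for more frequent updates.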

WebSeAl  allows  different  clients  to  use  different  routing 
strategies.  We  plan  to  experiment  with  various  strategies 
and  to  investigate  how  each  one  and  various  combinations 
perform  in  different  settings.  Our  goal  is  to  realize  a  set 
of  routing  strategies  and  to  adapt  dynamically  to  changing 
conditions. 

3.2  Name  Resolution 

A server is identified by a logical address in the form
of a hostname. When a client attempts to contact a server,
the DNS system transparently resolves the hostname to an
IP number, which is subsequently used to establish the connection.
To contact a distributed server in a transparent
fashion, a one-to-many mapping from the hostname to one
of the IP numbers of the replicated machines is needed.
WebSeAl pushes this name resolving functionality onto the
client agents.

Client agents maintain a cache of logical hostnames
and corresponding IP numbers to perform the mapping using
address information provided by server agents. When
a client agent attempts to access a distributed server for
which it does not have a mapping cached, it uses standard
DNS name resolving and contacts the server agent at that
logical address. The server agent uses the local web server
to generate the response and includes the addresses of the
individual server agents in the response. The client agent
extracts the addresses from the response and creates an entry
in its cache. Future requests to this distributed server
use this information to perform a one-to-many mapping
from the logical address to the individual hosts. The standard
DNS system is used only for bootstrapping: once a
mapping for a logical address is cached, the DNS system
is not needed to access any of the replicas.

¹ As k approaches infinity, all requests will be routed to the most
responsive machines.

Client agents need to retrieve the addresses of the server
agents only to create an initial entry or to refresh their
cache if the address list has changed in any way. To avoid
unnecessary transmission of address information, client
agents include a timestamp in their requests which indicates
the state of the currently cached mapping for the
given distributed server. Upon receipt of a request, each
server agent inspects this timestamp and includes the addresses
in the response only if more up-to-date address information
is available. This is very similar in nature to
the If-Modified-Since header [2], which is used to
avoid retrieving cached files which have not been modified
since a certain date.

HTTP allows application specific header fields and requires
that all intermediaries such as proxies or gateways
conforming to HTTP ignore these and forward
them unchanged. We utilize this to “piggyback” timestamps
and addresses in HTTP messages. WebSeAl introduces
two new message headers: Replica-Date
and Replica-Addresses. Client agents use the first
header to tell server agents the status of their cached addresses
for the distributed server at hand. Servers use both
headers to return a list of addresses and the timestamp at
which this information was updated.
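
The server agent's side of this exchange can be sketched as follows. The header names Replica-Date and Replica-Addresses come from the text, but their value formats are not given here: the sketch assumes an HTTP-date string for the timestamp and a comma-separated list of host:port entries for the addresses. The function name `replica_headers` is hypothetical, and leaving out the addresses when the request has no Replica-Date header is one possible policy for standard clients.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def replica_headers(request_headers, addresses, updated_at):
    """Return the extra response headers a server agent might add.

    request_headers: dict of request header names to values
                     (exact-case lookup, for simplicity).
    addresses:       list of "host:port" strings for all replicas
                     (assumed format).
    updated_at:      timezone-aware UTC datetime of the last change to
                     the address list, with whole-second resolution.
    """
    client_value = request_headers.get("Replica-Date")
    if client_value is None:
        # No timestamp: treat the sender as a standard client.
        return {}

    client_date = parsedate_to_datetime(client_value)
    if client_date >= updated_at:
        # The client's cached mapping is current; send nothing extra.
        return {}

    return {
        "Replica-Date": format_datetime(updated_at, usegmt=True),
        "Replica-Addresses": ", ".join(addresses),
    }
```

A client agent with a stale or empty cache would receive both headers and refresh its mapping, while an up-to-date client agent would get an ordinary response with no added bytes.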

Mapping a logical hostname to a set of IP numbers
shares many similarities with the DNS based and server side
approaches described in the previous section. Notice that
these approaches require that the servers on all replicated
hosts accept connections at the same port. Also, the directory
structure must be identical on each host. WebSeAl's
architecture relaxes these restrictions. The mapping
from hostname to IP numbers can be easily extended
to a mapping from hostname and port to IP number and
port to accommodate usage of different port numbers.
This requires that the address information included in responses
be extended to contain port numbers as well as
hostnames. Path offsets can be accommodated similarly.
For example, www.yahoo.com:80/ can be mapped to
www.cs.nyu.edu:8888/yahoo/. On the first host,
the server is accepting connections at port 80 and the directory
structure is rooted at /. On the second host, the
server accepts connections at port 8888 and the root directory
is at /yahoo/. Many mirror sites use different root
directories and require a relative path offset.
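
The extended mapping can be sketched in a few lines of Python. The replica is represented here as a (host, port, path offset) tuple, which is an assumed representation for illustration; the function name `rewrite_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_url(url, replica):
    """Map a logical URL onto a replica.

    replica: (host, port, path_offset) tuple, e.g.
             ("www.cs.nyu.edu", 8888, "/yahoo/"); assumed representation.
    """
    host, port, offset = replica
    parts = urlsplit(url)
    # Prefix the original path with the replica's root directory.
    path = offset.rstrip("/") + "/" + parts.path.lstrip("/")
    netloc = f"{host}:{port}"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))
```

Applied to the example above, a request for http://www.yahoo.com:80/index.html would be rewritten to http://www.cs.nyu.edu:8888/yahoo/index.html.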

3.3  Implementation  Issues 

WebSeAl requires a server agent module at the server
side (fig. 2). This functionality could be added to existing
web servers quite easily and should impose little
computational overhead. However, to create a usable system
without having to modify existing servers, WebSeAl
provides a stand-alone Java application which implements
the server agent functionality. It intercepts every incoming
request, forwards it to the local web server, accepts the
response, adds the address information to the response as
needed, and forwards it to the client agent.


Figure  2:  A  distributed  web  site  with  WebSeAl  client 
agents  and  server  agents. 

The client agent module is somewhat more complex,
but it should be fairly straightforward to extend existing
web browsers to support this functionality. Similar to the
server agent, WebSeAl provides a stand-alone Java application
which realizes the client side functionality in order
to provide a usable system without having to modify existing
clients. We utilize the fact that virtually all browsers
support proxies to intercept requests. When the client agent
is started up, it creates a server socket which accepts HTTP
requests, very much like a proxy does. By configuring the
browser to use the “proxy” (i.e., WebSeAl client agent), the
client agent effectively intercepts each request.

Proxies are generally used to allow Internet access
through firewalls and perform caching of web documents.
WebSeAl's client agent can accommodate proxies in two
ways. First, a client agent can be located between one or
more clients and a proxy. Since name resolution is performed
at the client agent, the proxy will treat identical
documents from different replicas of the same distributed
server as different documents and create redundant copies
in its cache. Alternatively, the client agent can be located
“behind” the proxy. This configuration avoids the problem
of redundant copies in the proxy cache. Also, only one address
cache and a single set of statistical data are maintained
for a number of users, resulting in more up-to-date address
caches and more accurate estimates.

Both WebSeAl's server and client agent functionality
should ideally be included in web servers and browsers or
proxies. We provide client and server agents to enable service
providers and users to take advantage of this technology
without the need to modify existing systems. Independent
of whether agents are used or existing systems are
modified, for a system like WebSeAl to gain wide acceptance
it needs to be backward compatible with regard to
clients and servers lacking this functionality. WebSeAl is
backward compatible and supports gradual infiltration:

•  WebSeAl Client and Standard Server: A standard
HTTP server is required to ignore the timestamp
header in a request from a WebSeAl client agent and
will service the request as usual. The lack of address
information in the response indicates to the client
agent that it is dealing with a standard server. It can
react to this, for example, by infrequently including
the timestamp in its future requests in order to update
its cache in case this site is upgraded.

•  Standard Client and WebSeAl Server: A request
received by a server agent will not contain a timestamp
header if the client lacks WebSeAl functionality.
The server can react to this in several ways; two
possibilities are: (1) it services the request in a standard
manner without including any address information
in its response; (2) it routes the request on behalf
of the client to individual servers.
4  Management  of  the  Server  Pool

WebSeAl client agents are noncooperative in the sense
that they make their routing decisions independently from
each other, striving to optimize their individual performance.
While each client can implement a routing strategy
of its choice, in the current design client agents route requests
to servers with minimal average response time. The
operating point of the system (i.e., the load distribution
over the server pool) is therefore solely the result of the
interaction among the various distributed client agents and
cannot be controlled by the service provider. In this section,
we will discuss strategies that can be used at the server
side to control the operating point of the system while the
client agents make their routing decisions in a noncooperative
manner.

The service provider aims at distributing the load currently
offered to the server pool in a way that is deemed
efficient from the system's point of view. The provider, for
instance, might desire an operating point that minimizes
the overall average response time of the server pool. In
other cases, the provider might want to discourage usage
of certain machines (even if they are the most responsive
ones) in order to perform other site specific tasks. Therefore,
a mechanism is needed to make the distributed client
agents implement routing strategies which lead to an operating
point that coincides with the desired one.

The problem of managing the behavior of systems
where control is distributed and noncooperative is a fundamental
one. The interaction among the various distributed
controllers (client agents in WebSeAl) can be modeled as
a game, and Game Theory provides the systematic framework
to study and analyze the behavior of such systems;
for an overview of game theoretic aspects in computer networking
see [11] and references therein. The operating
points of the system are the Nash equilibria of the underlying
control game. Noncooperative equilibria are inherently
inefficient: while each controller strives to optimize its individual
performance, the overall behavior of the system
is, generically, suboptimal.

WebSeAl uses a pricing mechanism to provide incentives
to the noncooperative client agents to implement routing
strategies that lead to the desired load distribution over
the server pool. The methodology is motivated by recent
analytical studies in the area of networking which
have shown that a network/service provider can enforce
any desired operating point by means of appropriate pricing
strategies [13, 14]. The key idea in WebSeAl's pricing
mechanism is that there is a service cost associated
with obtaining service from each server in the pool. Client
agents now make their routing decisions based not
only on performance statistics, but also on service cost information
for each server. The main assumption behind
this mechanism is that the client agents are indeed “sensitive”
to service costs. This behavior is expected in private
Intranets where client agents and the pricing mechanism
are part of the same management system. For external
client agents accessing the web site, this behavior
can be enforced by actual usage-based service charges (for
commercial web sites), or by means of a limited electronic
budget allocated to each client; an architecture developed
according to these ideas is proposed in [12]. When client
agents are sensitive to service costs, the service provider
can control not only the load distribution over the available
servers, but also the total offered load itself.

To support pricing functionalities in WebSeAl, each distributed
web site is equipped with a pricing manager module.
Based on the targeted operating point, the pricing manager
determines the service costs to access each server and
communicates them to the corresponding server agent. The server
agent provides pricing information about the server to the
client agents that receive service from it. In the remainder
of this section, we will first discuss the pricing strategies
in the current design of WebSeAl and then address some
implementation issues.

4.1  Pricing  Strategies 

The goal of the pricing mechanism in WebSeAl is twofold:

•  Avoidance of congestion (overload conditions) at various
servers.

•  Load balancing, that is, distribution of the total load
offered to the web site among the available servers in
a way that is deemed efficient by the provider.

The pricing strategies in the current version of WebSeAl
are based on analytical results in [14]. That study considers
a system of general network resources accessed by a number
of noncooperative clients. Each resource is characterized
by its “capacity,” that is, the maximum load that can
be accommodated by the resource. Congestion pricing is
proposed as a means for avoiding overload conditions: the
service cost per size unit (i.e., the price) of each resource
is proportional to the congestion level at the resource, which
depends on the total load offered to it by the clients. More
specifically, the price of each resource is given by the congestion
function associated with the resource multiplied by
a weight factor. These weights determine the relative sensitivity
of the clients to the congestion level at the various
resources and will be referred to as the discount factors.
Load balancing can be achieved by appropriate choice of
these discount factors. This pricing strategy is shown to
allow the provider to enforce any desired operating point
while the clients make their routing decisions noncooperatively.

Along the lines of these analytical results, the pricing
strategy in the current design of WebSeAl is based on
determining a discount factor for each server in the pool,
which determines the relative sensitivity of the client agents
to the responsiveness of the server.2 In particular, the
performance metric considered by each client agent in making
its routing decisions is the average response time of each
server multiplied by the corresponding discount factor.
Therefore, if $w_i$ is the discount factor of server i and
$T_{mi}$ the average response time from server i to client
agent m, then the routing strategy of the client agent
described by eq. 1 becomes:


$$P_{mi} = \frac{1/(w_i T_{mi})^k}{\sum_j 1/(w_j T_{mj})^k} \qquad (2)$$
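As an illustration of eq. 2, the following minimal sketch computes the routing probabilities of a single client agent. The function name, the sample response times, and the choice k = 2 are assumptions made for this example and are not taken from the WebSeAl implementation.

```python
def routing_probabilities(avg_response_times, discount_factors, k=1.0):
    """Return the probabilities P_mi of eq. 2 for one client agent m.

    avg_response_times[i] is T_mi and discount_factors[i] is w_i.
    """
    # Attractiveness of server i: 1 / (w_i * T_mi)^k.
    scores = [1.0 / (w * t) ** k
              for w, t in zip(discount_factors, avg_response_times)]
    total = sum(scores)
    # Normalize so that the probabilities sum to one.
    return [s / total for s in scores]


if __name__ == "__main__":
    # Hypothetical numbers: server 0 answers twice as fast as server 1.
    print(routing_probabilities([0.1, 0.2], [1.0, 1.0], k=2))
    # Halving the discount factor of server 1 offsets its slower responses.
    print(routing_probabilities([0.1, 0.2], [1.0, 0.5], k=2))
```

With equal discount factors the faster server attracts most of the requests; lowering the discount factor of the slower server shifts requests towards it, which is the lever the provider uses for load balancing.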


4.2  Implementation  Issues 

The server discount factors are determined by the pricing
manager based on the operating point that the provider wants to
enforce. One way to determine these factors is to map the
parameters of the model considered in [14] to the
characteristics of WebSeAl and apply the corresponding
analytical results, expecting to achieve a good approximation
of the desired operating point. Instead, we chose to use an
adaptive algorithm, also proposed in [14], which does not
depend on the details of the underlying analytical model. The
algorithm updates the discount factors iteratively, based on
the “distance” of the current operating point from the desired
one.

2 Note that the discount factor of each server is the same for
all clients.

If $f_i^*$ denotes the desired load at server i and $f_i^{(n)}$
the actual load offered to the server during the n-th
iteration, then its discount factor $w_i$ is updated using the
following:

$$w_i(n+1) = w_i(n)\, e^{\theta_i (f_i^{(n)} - f_i^*)}, \qquad (3)$$

where $\theta_i > 0$ is a constant that determines the rate of
change in the discount factor of server i. The idea behind this
iterative scheme is that, if the server is currently receiving
less load than the desired one, its discount factor should be
decreased. This decreases the clients’ sensitivity to the
congestion level at the server, thus encouraging them to direct
more of their requests to it. Similarly, if the server receives
more load than the desired one, its discount factor is
increased. Under a set of general assumptions guaranteeing that
the client population as a total reacts “rationally” to price
changes, this iterative scheme was shown in [14] to drive the
system to the desired operating point.
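The behaviour of this iteration can be illustrated with a small closed-loop sketch. It assumes a toy client model in which the aggregate load splits over the servers according to eq. 2 with fixed average response times; this model, the numerical values, and all function names are assumptions made for illustration and are not part of WebSeAl or of the analysis in [14].

```python
import math


def update_discount_factors(w, actual_load, target_load, theta):
    """One step of eq. 3: w_i(n+1) = w_i(n) * exp(theta_i * (f_i(n) - f_i*))."""
    return [wi * math.exp(th * (f - f_star))
            for wi, th, f, f_star in zip(w, theta, actual_load, target_load)]


def offered_load(w, response_times, total_load, k):
    """Toy client model (assumption): total load splits according to eq. 2."""
    scores = [1.0 / (wi * t) ** k for wi, t in zip(w, response_times)]
    total = sum(scores)
    return [total_load * s / total for s in scores]


def run(iterations=500):
    response_times = [0.1, 0.2, 0.4]   # hypothetical; held fixed for simplicity
    target = [60.0, 30.0, 10.0]        # desired load f_i* in requests per time unit
    theta = [0.01, 0.01, 0.01]         # update rates theta_i
    w = [1.0, 1.0, 1.0]                # initial discount factors
    for _ in range(iterations):
        load = offered_load(w, response_times, sum(target), k=2)
        w = update_discount_factors(w, load, target, theta)
    return w, offered_load(w, response_times, sum(target), k=2)


if __name__ == "__main__":
    factors, load = run()
    print("discount factors:", factors)
    print("offered load:    ", load)
```

In this toy setting an overloaded server sees its discount factor grow and an underloaded one sees it shrink until the offered load matches the target. Real response times depend on load, so the sketch only shows the direction and mechanics of the update, not the convergence result of [14].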

In the current implementation of WebSeAl, server load is
expressed in requests per unit of time. Considering HTTP
requests, we expect that each client generates a large number
of requests, each of small to moderate size. Therefore, this is
a satisfactory approximation. A more precise load metric would
consider the actual size of each request and will be
incorporated in future implementations.

The pricing manager periodically collects information about the
load offered to each server by contacting the corresponding
server agent, updates the discount factors according to
iteration 3, and communicates them to the server agents. Each
agent receives only the update of its associated server and is
responsible for advertising it to the client agents. This is
achieved by piggybacking the discount factor of the server onto
the HTTP messages that contain the responses to the clients’
requests.
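The piggybacking step can be pictured as carrying the discount factor in an extra response header. The sketch below is only an assumption about how this might look: the text does not say which header or encoding WebSeAl uses, so the header name `X-Discount-Factor` and both helper functions are hypothetical.

```python
import math

# Hypothetical header name; the text does not specify how the value is encoded.
DISCOUNT_HEADER = "X-Discount-Factor"


def attach_discount(headers, discount_factor):
    """Server agent side: return a copy of the response headers with w_i added."""
    out = dict(headers)
    out[DISCOUNT_HEADER] = repr(float(discount_factor))
    return out


def read_discount(headers, default=1.0):
    """Client agent side: extract w_i, falling back to a neutral default."""
    raw = headers.get(DISCOUNT_HEADER)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    # Discount factors must be positive and finite to be usable in eq. 2.
    if value > 0 and math.isfinite(value):
        return value
    return default


if __name__ == "__main__":
    response_headers = attach_discount({"Content-Type": "text/html"}, 0.75)
    print(response_headers)
    print(read_discount(response_headers))
```

Falling back to a neutral default lets a client agent keep working with servers that do not advertise a discount factor.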

Iteration 3 indicates that the discount factor of each server
is determined using only local information, namely, the
difference between the load currently offered to it and the
targeted one. Therefore, the adaptive algorithm is well suited
for distributed implementation: if the server agent is
cognizant of the desired load at the server ($f_i^*$ in eq. 3),
then it can update the discount factor of the server without
contacting the pricing manager. Note, however, that the target
load (in requests per time unit) typically depends on the total
load offered to the web site, information that is only
available to the pricing manager. If the total offered load is
not expected to change dramatically, the pricing manager can
inform the server agents about their target load less
frequently. Then, each server agent can use iteration 3 to
update the server’s discount factor on a faster time scale.
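A minimal sketch of this two-time-scale arrangement is given below, assuming a server agent object that stores its own target load. The class and method names are hypothetical and do not reflect the actual WebSeAl code.

```python
import math


class ServerAgent:
    """Hypothetical server agent that applies iteration 3 locally."""

    def __init__(self, target_load, theta, discount_factor=1.0):
        self.target_load = target_load        # f_i*, set by the pricing manager
        self.theta = theta                    # update rate theta_i > 0
        self.discount_factor = discount_factor

    def set_target(self, target_load):
        """Slow time scale: the pricing manager occasionally refreshes f_i*."""
        self.target_load = target_load

    def local_update(self, measured_load):
        """Fast time scale: update w_i from locally measured load only."""
        self.discount_factor *= math.exp(
            self.theta * (measured_load - self.target_load))
        return self.discount_factor


if __name__ == "__main__":
    agent = ServerAgent(target_load=50.0, theta=0.01)
    for load in (60.0, 55.0, 48.0):           # hypothetical measurements
        print(agent.local_update(load))
    agent.set_target(40.0)                    # infrequent message from the manager
    print(agent.local_update(48.0))
```

Only `set_target` requires communication with the pricing manager; `local_update` can run as often as the server agent measures its own load.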

5  Experiments 

In this section, we present initial performance results. For
the first three tests, we used ten mirror sites of a popular
software archive which repeatedly appears in [19] as one of the
most accessed web sites. These tests were conducted under real
world conditions, using standard machines, networks, and
software. The ten servers were located on six continents: two
each in North America, South America, Europe, and Asia, and one
each in Africa and Australia. The client was running at New
York University. Geographically, the closest server to the
client was located in Massachusetts, the second closest in
California.

The client, running five threads, generated 1000 requests for a
file of length 4253 bytes. All requests were addressed to a
single logical address. A local client agent intercepted each
request and provided transparent access to a distributed web
site. Since we experimented with existing web sites not running
WebSeAl’s server agent, we added the server addresses manually
into the cache of the client.

For the first experiment, we ran two tests: one using WebSeAl’s
client agent and one contacting the closest server directly.
Using the client agent, the total response time for 1000
requests was 291.6 s. The response time we measured is the
end-to-end delay which includes connection establishment,
network delay, and server time. 95.4% of the requests were
serviced by the closest server. The total response time for
contacting the closest server directly was 266.9 s. This
translates to an overhead of 9.2%. The fact that the WebSeAl
client agent sent the vast majority of the requests to the
closest server indicates that this server was delivering the
best performance. Besides the computational and communication
overhead of the client agent, an important factor contributing
to this overhead is that 4.6% of the requests were routed to
slower servers to update performance data for these machines.
As mentioned before, this could be avoided by occasionally
sending probes at the cost of generating additional traffic.

In our second experiment, we used the same setup as before, but
ran the experiment at a different time of the day. This time,
only 3.9% of the requests were serviced by the closest site.
The total response time was 761.4 s as opposed to 1295.3 s when
contacting the closest host directly, an improvement of 41.2%.
These two experiments indicate that WebSeAl can deliver
significant performance gains while imposing only little
overhead, compared to the scenario in which the user is always
able to pick the fastest machine.
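The percentages quoted for the first two experiments follow directly from the measured totals. The short calculation below reproduces them; the helper name is an assumption made for this illustration. The overhead evaluates to about 9.25%, which the text reports as 9.2%, and the improvement to about 41.2%.

```python
def percent_change(measured, baseline):
    """Relative change of `measured` with respect to `baseline`, in percent."""
    return 100.0 * (measured - baseline) / baseline


# First experiment: client agent (291.6 s) vs. closest server directly (266.9 s).
overhead = percent_change(291.6, 266.9)

# Second experiment: client agent (761.4 s) vs. closest server directly (1295.3 s).
improvement = -percent_change(761.4, 1295.3)

if __name__ == "__main__":
    print(f"overhead:    {overhead:.2f}%")
    print(f"improvement: {improvement:.2f}%")
```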

The third experiment investigates how WebSeAl client agents
adapt to the dynamic performance changes of individual servers.
As with the previous two experiments, the client, using a local
client agent, generated 1000 requests to a logical address.
After 300 requests, we started downloading several large files
from the fastest site, which happened to be the geographically
closest one, thus generating additional load at that server.
This traffic was discontinued after another 300 requests. Of
the first 300 requests, 93.3% were serviced by the closest
server. This percentage dropped to 11.6% for the next 300
requests, and went up again to 93.2% for the last 400 requests.
The second closest server initially received 2.0% of the
requests, which increased to 69.3% when the performance of the
closest server started to degrade. This indicates that WebSeAl
adapts well to performance changes in the server pool (fig. 3).


Figure 3: Request distribution in a dynamically changing
environment.

For our last experiment, we used several identical machines in
a controlled environment to show how WebSeAl reacts to changes
in the server pool. On each of four machines, we started a
WebSeAl server agent and a standard web server. We first used
three servers, added another one after about 300 requests, and
removed one of the original three servers after another 400
requests3. Since we used identical machines, it can be expected
that the two fully available machines would each get 300
requests, the other two each 200 requests. The actual
distribution was 295 and 286 requests for the first two
machines, and 225 and 194 requests for the other two. This
illustrates that WebSeAl quickly and effectively accommodates
changes in the server pool.
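The expected 300/300/200/200 split can be checked with a short calculation, assuming that identical servers share the requests of each phase equally and that the run comprised 1000 requests in total (the sum of the reported counts). The server labels and the function name are hypothetical.

```python
def expected_split(phases):
    """Sum equal per-phase shares for each server.

    `phases` is a list of (request_count, active_servers) pairs.
    """
    totals = {}
    for count, servers in phases:
        share = count / len(servers)          # identical machines share equally
        for name in servers:
            totals[name] = totals.get(name, 0.0) + share
    return totals


# Assumed phase lengths: "about 300" and "another 400" taken as exact,
# with the remaining 300 of 1000 requests served after server C is removed.
phases = [
    (300, ["A", "B", "C"]),        # three original servers
    (400, ["A", "B", "C", "D"]),   # fourth server D added
    (300, ["A", "B", "D"]),        # original server C removed
]

if __name__ == "__main__":
    print(expected_split(phases))
```

Servers A and B are available throughout and collect 100 requests in each phase, while C and D each take part in only two phases.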

6  Conclusions 

WebSeAl is a novel architecture for managing resources of web
sites consisting of a pool of replicated servers. Unlike most
existing proposals, in WebSeAl it is the responsibility of the
clients to route their requests to individual servers. This
architecture scales well with the number of users, delivers
flexible quality of service, and provides fault masking.

We proposed routing strategies for directing client requests to
the most responsive servers. Unlike server side approaches,
routing decisions are based not only on server load, but also
on network traffic conditions. We also discussed strategies
that can be used at the server side to induce efficient
allocation of resources (load balancing) while clients make
their routing decisions in a noncooperative manner. Motivated
by recent studies on game-theoretic aspects of networking, we
proposed a pricing mechanism that provides incentives to the
clients to route their requests in a way that is deemed
efficient by the service provider.

A  prototype  system  based  on  this  architecture  has 
been  implemented  and  its  functionality  has  been  validated 
through  a  series  of  experiments.  These  results  indicate  that 
WebSeAl  can  deliver  significant  performance  gains  while 
imposing  minimal  overhead. 